Implicit Temporal Differences

Authors

  • Aviv Tamar
  • Panos Toulis
  • Shie Mannor
  • Edoardo M. Airoldi
Abstract

In reinforcement learning, the TD(λ) algorithm is a fundamental policy evaluation method with an efficient online implementation that is suitable for large-scale problems. One practical drawback of TD(λ) is its sensitivity to the choice of the step-size. It is an empirically well-known fact that a large step-size leads to fast convergence, at the cost of higher variance and risk of instability. In this work, we introduce the implicit TD(λ) algorithm which has the same function and computational cost as TD(λ), but is significantly more stable. We provide a theoretical explanation of this stability and an empirical evaluation of implicit TD(λ) on typical benchmark tasks. Our results show that implicit TD(λ) outperforms standard TD(λ) and a state-of-the-art method that automatically tunes the step-size, and thus shows promise for wide applicability.
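The abstract does not state the update rule, so the following is only a minimal sketch of how an implicit step can differ from a standard one. It assumes λ = 0, a linear value estimate V(s) = φ(s)·θ, and that "implicit" means the current-state value is evaluated at the new weights; the function names and the one-state toy problem are hypothetical and are not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def td0_step(theta, phi, reward, phi_next, alpha, gamma):
    """Standard TD(0) step for a linear value estimate V(s) = phi(s) . theta."""
    delta = reward + gamma * dot(phi_next, theta) - dot(phi, theta)
    return [t + alpha * delta * f for t, f in zip(theta, phi)]


def implicit_td0_step(theta, phi, reward, phi_next, alpha, gamma):
    """Assumed implicit variant: the current-state value uses the NEW weights.

    Solving
        theta_new = theta + alpha * (reward + gamma * phi_next . theta
                                     - phi . theta_new) * phi
    in closed form (Sherman-Morrison) yields the standard TD(0) direction
    with the step-size shrunk by 1 / (1 + alpha * ||phi||^2).
    """
    delta = reward + gamma * dot(phi_next, theta) - dot(phi, theta)
    scale = alpha / (1.0 + alpha * dot(phi, phi))
    return [t + scale * delta * f for t, f in zip(theta, phi)]


def run(step, n_steps, alpha):
    """Hypothetical toy problem: one state that loops to itself.

    Feature phi = [2.0], reward 1, discount 0.5, so the true value is 2
    and the exact weight is theta = 1.
    """
    theta, phi = [0.0], [2.0]
    for _ in range(n_steps):
        theta = step(theta, phi, 1.0, phi, alpha, 0.5)
    return theta[0]


if __name__ == "__main__":
    # Same deliberately large step-size for both variants.
    print("standard:", run(td0_step, 50, alpha=2.0))
    print("implicit:", run(implicit_td0_step, 50, alpha=2.0))
```

Under these assumptions both steps cost the same order of work per update (a few inner products), and with the deliberately large step-size the standard iterate moves away from θ = 1 while the implicit iterate settles near it, which is consistent with the stability claim in the abstract.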

Similar resources

Direct and Indirect Timing Functions in Unilateral Hemispheric Lesions

Introduction: The neural substrates of temporal processing are still not fully known. The majority of interval timing studies have dealt with this subject in the context of “explicit timing” (computing the time intervals explicitly). The hypothesis of “implicit timing” (implicitly using temporal processing to improve function) has also been proposed. This lesion study addressed explicit and implicit ti...

Implicit perceptual-motor skill learning in mild cognitive impairment and Parkinson's disease.

Objective: Implicit skill learning is hypothesized to depend on nondeclarative memory that operates independent of the medial temporal lobe (MTL) memory system and instead depends on cortico striatal circuits between the basal ganglia and cortical areas supporting motor function and planning. Research with the Serial Reaction Time (SRT) task suggests that patients with memory disorders due to MT...

Right Hand Preference in Implicit Motor Learning in Children with High-Functioning Autism and Asperger Syndrome

Objectives: Cerebral hemisphere functioning has been found to be abnormal in children with ASD. The role of lateralization in implicit and explicit motor learning has received little attention in ASD research. The main goal of this study is to investigate the differences between the two hands in implicit and explicit motor learning in children with ASD and a matched typical group. Methods: In the ...

Do Implicit and Explicit Measures of the Sense of Agency Measure the Same Thing?

The sense of agency (SoA) refers to perceived causality of the self, i.e. the feeling of causing something to happen. The SoA has been probed using a variety of explicit and implicit measures. Explicit measures include rating scales and questionnaires. Implicit measures, which include sensory attenuation and temporal binding, use perceptual differences between self- and externally generated sti...

From past to future: Temporal self-continuity across the life span.

Although perceived continuity with one's future self has attracted increasing research interest, age differences in this phenomenon remain poorly understood. The present study is the first to simultaneously examine past and future self-continuity across multiple temporal distances using both explicit and implicit measures and controlling for a range of theoretically implicated covariates in an ...

Medial temporal lobe-dependent repetition suppression and enhancement due to implicit vs. explicit processing of individual repeated search displays

Using visual search, functional magnetic resonance imaging (fMRI) and patient studies have demonstrated that medial temporal lobe (MTL) structures differentiate repeated from novel displays, even when observers are unaware of display repetitions. This suggests a role for MTL in both explicit and, importantly, implicit learning of repeated sensory information (Greene et al., 2007). However, recen...

Journal:
  • CoRR

Volume: abs/1412.6734

Publication date: 2014